Field programmable digital signal processing array integrated circuit



(57) A field programmable, digital signal proces- 
sing integrated circuit is formed in a semicon- 
ductor die and includes an array of arithmetic 
logic (ALU) circuits. A user programmable inter- 
connect architecture is superimposed on the 
array of ALU circuits. One or more interface 
circuits comprising digital-to-analog (D/A) converters
or analog-to-digital (A/D) converters are
provided on the integrated circuit to interface to 
off-chip analog input signals and provide off- 
chip analog output signals. Circuitry is provided 
to program the interconnections between the 
interface circuits and the ALU circuits and be- 
tween individual ones of the ALU circuits, as 
well as to define the specific functions of the 
individual ALU circuits. 
The present invention relates to integrated circuits
and, more specifically, to user-programmable
integrated circuits. More particularly, the present in- 
vention relates to user-programmable mixed analog 
and digital integrated circuits. 

General purpose linear integrated circuits have 
limited themselves to specific functions such as op- 
erational amplifiers, phase locked loops, compara- 
tors, A/D converters, video amplifiers, transistor ar- 
rays, etc. These circuits form the building blocks of 
analog systems. Integrating these circuits into higher 
functions is difficult due to the need to employ exter- 
nal components (i.e., resistors, capacitors, inductors, 
etc.) to determine their exact function. Thus once in- 
tegrated, these circuits become specialized. In order 
to be practicable for design, manufacture, and sale, 
such a specialized part must have a large usage 
base. One illustrative example of such a circuit is an
audio amplifier which may be used in stereo systems 
or television sets. Without a large usage base, the de- 
sign and manufacture of such a circuit is not econom- 
ical. 

During the manufacture of analog circuits a sig- 
nificant cost is the final trimming of each individual 
circuit. This is required because component values
vary, as does stray capacitance due to component
placement.

Another common problem in electronics is that 
various parts of a complex signal need to be kept in 
phase while utilizing different circuit paths. This is 
commonly done in color television sets where the lu- 
minance information is routed through a delay line 
while the chrominance information is processed. 

Furthermore, many common analog systems 
such as television sets, VCRs, and stereo systems
are currently employing many digital functions. 
Hence the integration of these circuits onto a single 
integrated circuit die requires mixed analog and digital 
design and a process to manufacture them. The man- 
ufacturing process for such integrated circuits is com- 
plex and expensive as the transistors used in analog 
circuit design are typically radically different from 
those used in digital design. 

One prior-art approach to this problem has been 
to design circuits which operate employing digital sig- 
nal processing (DSP) techniques. These devices use 
a microprocessor core to simulate the mathematical 
equivalent of the analog system. One common appli- 
cation for such chips is digital filtering of the signals 
before conversion back to the analog world. 

A fundamental limitation of these integrated DSP
devices is that the device speed is limited by the von
Neumann architecture of the microprocessor, in which
many processor functions are required for each time
slice of the analog signal. This limitation has heretofore
limited the speed of such devices to frequencies
in the audio spectrum. This is of course due to the fact
that the customization of the function is achieved by
the coding of the instructions in the microprocessor.

Another approach to the mixed analog and digital 
integrated circuit is shown in United States Patent No. 
5,107,146 to El-Ayat. This patent discloses a user-programmable
architecture including a mixture of
analog and digital circuit modules. The digital logic 
modules are of the type used in prior FPGA devices. 

Work has been done in the past on designing
processor arrays to speed up applications.
This class of machines is called MIMD or SIMD
(multiple instruction multiple data or single instruction
multiple data). These concepts utilize a plurality of
processor engines to perform logical operations,
such as multiplication and division. Each processor
engine is a von Neumann machine and occupies
significant die area on an integrated circuit.

It is an object of the present invention to provide
a user-programmable digital signal processing integrated
circuit which overcomes the limitations of the
prior art.

It is another object of the present invention to provide
a user-programmable digital signal processing
integrated circuit which does not require individual
trimming to maximize performance.

Yet another object of the present invention is to
provide a user-programmable digital signal processing
integrated circuit which allows the user to control
phase shifting of signals being processed therein.

It is a further object of the present invention to
provide a user-programmable digital signal processing
integrated circuit which may be easily programmed
by a user.

BRIEF DESCRIPTION OF THE INVENTION 


According to the present invention, a field programmable,
digital signal processing integrated circuit
is formed in a semiconductor die and includes an
array of arithmetic logic (ALU) circuits. A user-programmable
interconnect architecture is superimposed
on the array of ALU circuits. One or more interface
circuits comprising digital-to-analog (D/A)
converters or analog-to-digital (A/D) converters are
provided on (or off) the integrated circuit to interface
to off-chip analog input signals and provide off-chip
analog output signals. Other functional circuit blocks,
such as programmable read only memory (PROM) or
random access memory (RAM) circuits, may also be
disposed on the integrated circuit die. Circuitry is
provided to program the interconnections between the
interface circuits and the ALU circuits and between
individual ones of the ALU circuits, as well as to define
the specific functions of the individual ALU circuits.

The architecture of the present invention avoids
the von Neumann bottleneck characteristic of prior art
systems by eliminating the need for sequential
instructions. Each ALU circuit of the present invention
may be user customized to act like the mathematical
equivalent of an analog circuit element. The individual
ALU circuits are interconnected to one another and to
A/D and D/A interface circuits by user-programmable
interconnect elements.

The present invention will be further described 
hereinafter with reference to the following description 
of exemplary embodiments and the accompanying 
drawings, in which: 

FIG. 1 is a block diagram of the architecture for
an illustrative field programmable digital signal
processing integrated circuit according to a preferred
embodiment of the present invention.

FIG. 2a is a block diagram of an illustrative ALU
circuit suitable for inclusion in the field programmable
digital signal processing integrated circuit according
to the present invention.

FIG. 2b is a state diagram which discloses in de- 
tail the operation of the control circuit portion of the 
ALU circuit of FIG. 2a. 

FIG. 3 is a schematic diagram of an architecture 
for a bus interchange which can perform a single or 
multibit shift operation. 

FIG. 4a is a schematic diagram of a simple invert- 
ing analog amplifier. 

FIG. 4b is an equivalent block diagram of the am- 
plifier of FIG. 4a implemented according to the pres- 
ent invention. 

FIG. 4c is an equivalent block diagram of the
amplifier of FIG. 4a implemented according to the
present invention and including a logarithmic feedback
element.

FIG. 5 is a graph showing the signal input and
signal output waveforms of the circuit of FIG. 4b
for a sinusoidal input waveform.

FIG. 6 is a graph showing the signal input and
signal output waveforms of the circuit of FIG. 4b
for a square input waveform.

FIG. 7a is a schematic diagram of a variation of 
the amplifier circuit of FIG. 4a. 

FIG. 7b is an equivalent block diagram of the am- 
plifier of FIG. 7a implemented according to the pres- 
ent invention in a manner which avoids pipelining dis- 
tortion in the output. 

FIG. 8 is a graph showing the input and output
voltages of the circuit of FIG. 7b for a 1 MHz sine wave
input.

FIG. 9 is a graph showing the input and output
voltages of the circuit of FIG. 7b for a 1 MHz square wave
input.

FIG. 10 is a block diagram of an illustrative ana- 
log shift register configured using the architecture of 
the present invention. 

FIGS. 11a and 11b are examples of a series RLC
tuned circuit implemented according to the present
invention.

Those of ordinary skill in the art will realize that
the following description of the present invention is
illustrative only and not in any way limiting. Other
embodiments of the invention will readily suggest
themselves to such skilled persons.

Referring first to FIG. 1, a block diagram is
presented of the architecture for an illustrative field
programmable digital signal processing integrated circuit
according to a preferred embodiment of the present
invention. The architecture of the present invention is
integrated on a single piece of semiconductor material,
and may be fabricated using known semiconductor
processing technology, such as CMOS technology,
which is presently preferred.

The field programmable digital signal processing
integrated circuit 10 of the present invention is built
around an array of arithmetic logic (ALU) circuits
shown at reference numerals 12-1 through 12-9. For
purposes of illustration, arithmetic logic units 12-1
through 12-9 are shown arranged as a regular array
comprising three rows and three columns of ALU
circuits. Those of ordinary skill in the art will readily
observe that the architecture and arrangement of FIG.
1 is illustrative only and not limiting, in that such
skilled persons will readily recognize that other numbers
of ALU circuits and other layout arrangements may be
employed.

At least one analog-to-digital (A/D) converter and
at least one digital-to-analog (D/A) converter circuit
may be optionally disposed on the integrated circuit
along with the ALU circuits. In the illustrative
embodiment of FIG. 1, two A/D circuits 14-1 and 14-2 and two
D/A circuits 16-1 and 16-2 are shown. In an actual
embodiment of the integrated architecture of the
present invention, A/D converters 14-1 and 14-2 and
D/A converters 16-1 and 16-2 will probably be located
near the periphery of the integrated circuit die upon
which the architecture 10 of the present invention is
disposed, but those of ordinary skill in the art will
understand that placement of these devices is largely a
matter of design choice. Such elements may even be
located off chip in certain applications.
As in any integrated circuit, a plurality of
input/output (I/O) pins are provided for supplying power
to the integrated circuit and for transporting electrical
signals onto and off of the integrated circuit. The
number of I/O pins provided on any actual embodiment of
the architecture of the present invention will be purely
a matter of design choice. A group of such I/O pins is
depicted as a single I/O block 18, but those of ordinary
skill in the art will recognize that I/O block 18
represents a plurality of I/O pins.
Other functional circuit blocks may be disposed
in the integrated circuit along with the other
previously-described elements. For example, and as shown in
FIG. 1, PROM devices 20-1 and 20-2 are shown disposed
in the integrated circuit architecture 10 of the
present invention. Those of ordinary skill in the art will
realize that other types of circuit elements, such as
RAM and ROM circuits, may be usefully employed in
the architecture of the present invention.
Finally, a user-programmable interconnect architecture
is superimposed upon the aforementioned circuit
elements. The user-programmable interconnect
architecture is used to connect the aforementioned
circuit elements to one another and to the I/O pins.

User-programmable interconnect architectures
include a plurality of interconnect conductors which
may be connected to one another, to inputs and outputs
of the various circuit elements, and to the I/O
pads by user-programmable interconnect elements.
These user-programmable interconnect elements
may take several forms as is known in the art. Examples
of such elements include antifuses, of which
there are numerous known examples, such as those
disclosed in United States Patents Nos. 4,899,205,
5,070,384, and 5,181,096, and pass transistors, such
as disclosed in the architecture described in United
States Patent No. 4,870,302. Those of ordinary skill
in the art will recognize that these examples are
non-exhaustive and merely illustrate the state of the
user-programmable interconnect element art. Unless
specifically noted otherwise herein, the meaning of the
term user-programmable interconnect element as
used herein shall be construed to cover all forms of
such interconnect elements. The structure, design,
and use of such user-programmable interconnect elements
are well known in the art and will not be recited
herein.

In FIG. 1, the user-programmable interconnect
architecture is shown diagrammatically as horizontal
interconnect conductors 22 and vertical interconnect
conductors 24 which are distributed throughout
and among the circuit elements of FIG. 1. Those of
ordinary skill in the art will recognize that FIG. 1 is very
general in this respect. The lines identified by reference
numerals 22 and 24 in the drawing figure are not
intended to represent individual interconnect conductors
but rather represent groups of conductors. An
actual arrangement of interconnect conductors useful
for employment in the present invention will be
disclosed in subsequent figures and text herein.

In actual embodiments of the architecture of the
present invention, some of the conductors will be
segmented and some conductors may run the entire
length or width of the array of circuit elements in the
architecture. Individual user-programmable interconnect
elements will be connected between selected
adjacent segments of the interconnect conductors to
selectively lengthen them, and other individual
user-programmable interconnect elements will be positioned
between intersecting horizontal and vertical
segments of the interconnect conductors. Non-exhaustive
examples of the segmenting of individual
interconnect conductors are seen in United States
Patents 4,870,302, 4,758,745, and 5,073,729.

Those of ordinary skill in the art will understand
that care must be taken in the design of the segmentation
of the interconnect conductors. Normally, output
signals from an ALU circuit will be passed to a
nearest neighbor ALU immediately above, below, or
to either side. However, some circuits (such as reactive
circuits) need to feed the terms back very quickly,
making short busses necessary. In addition, a signal
must occasionally be fed a long distance, as is the
case of an AGC signal. Long busses will be needed
for these signals. Fortunately these are often
slow responding signals and will not limit the circuit
speed. Those of ordinary skill in the art will understand
that it is preferable that as few user-programmable
interconnect elements as possible
be interposed in a single signal path to minimize
signal delay.

Although, in a normal circuit configuration defined
by a user, most of the interconnect conductors
comprising the interconnect architecture are shown
on the digital side of the circuit, i.e., between the outputs
of A/D converters 14-1 and 14-2 and the inputs
of D/A converters 16-1 and 16-2, there are situations
where it becomes advantageous to have access to
the internal interconnect conductor groups from outside
of the integrated circuit. According to one aspect
of the present invention, the interconnect conductor
groups may communicate with the I/O pins, either
directly, as shown in FIG. 1 by leftmost and rightmost
vertical conductor groups 24 entering I/O block 18, or
through appropriate input and output buffers as is
known in the art. This feature of the present invention
allows a number of integrated circuits according to the
present invention to be connected together to form
larger circuits, which may be clocked together as will
be described further herein.

Referring now to FIG. 2a, the structure and
organization of a presently preferred single ALU circuit
12 suitable for use in the architecture of the present
invention is depicted in block diagram form. ALU 12
may be configured using standard CMOS building
blocks for circuits of this type. Those of ordinary skill
in the art will recognize that other ALU circuits and
variations of the circuit presented in FIG. 2a are usable
in the present invention.

According to a presently preferred embodiment
of the present invention, ALU 12 includes a first 2:1
multiplexer 26 and a second 2:1 multiplexer 28. Both
the first and second multiplexers 26 and 28 are n bits
wide, where n is the width of the data byte used by the
ALU. The byte size used in any actual embodiment of
the invention could be from 2 to 64 bits wide and will be
dictated by resolution, size, and other design
considerations. A typical byte size might be, for example, 8
bits. Practically, a data byte would be the width of the
A/D and D/A converters used. This would be, for
instance, 8 or 10 bits in the case of video D/A converters
and 18 bits for audio D/A converters.

For some applications, however, variations on this
structure may be necessary. For example, the voltage
in tuned reactive circuits is Q (quality factor)
times higher than the input voltage. Typically a Q may
be as high as 100, which would require an extra 8 bits
to be added to the ALU circuits 12 to accommodate
the voltage, resulting in 16 to 18 bits for video D/A
converters. If the programmable circuit is optimized
for reactive circuits, only the internal nodes of the
reactive circuits need be this size. The rest of the ALU
data paths could be 8 to 10 bits wide. Another solution
to this problem would be to configure all of the ALU
circuits 12 to be 8 to 10 bits wide and to program an
AGC circuit consisting of a peak detector, a comparator,
and a gain adjust circuit into the circuit to reduce
the input signal amplitude to the reactive circuit module,
thereby preventing the ALU from overflowing.
Those of ordinary skill in the art will envision numerous
other similar modifications of the basic architecture
of the present invention.
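
The word-length reasoning above can be checked with a short calculation. The following Python sketch is illustrative only; the function name extra_bits_for_q is hypothetical, and it assumes that the extra headroom is simply the number of bits needed to represent a value Q times larger than the input. Under that assumption a Q of 100 strictly needs 7 extra bits, so the figure of 8 extra bits quoted above leaves one bit of margin.

```python
import math


def extra_bits_for_q(q: float) -> int:
    """Smallest number of extra bits whose range covers a gain of q.

    Assumption (not from the source): headroom is ceil(log2(q)).
    """
    return math.ceil(math.log2(q))


converter_bits = 8   # video-class converter width quoted in the text
q = 100              # quality factor quoted in the text

headroom = extra_bits_for_q(q)
print(headroom)                   # bits strictly needed for a 100x larger value
print(converter_bits + headroom)  # resulting internal node width
```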

Referring again to FIG. 2a, the data inputs (A and
B) of first 2:1 multiplexer 26 are connected to n-wide
input busses 30 and 32, and the data inputs (C and D)
of second 2:1 multiplexer 28 are connected to n-wide
input busses 34 and 36. Numerous other configurations
are possible, but it is preferred that the input
busses physically exit the ALU 12 in different directions
to maximize the interconnect possibilities. For
example, one end of input busses 30, 32, 34, and 36
might exit the ALU block horizontally and one end may
exit vertically to permit connection to both horizontal
and vertical interconnect conductors in the interconnect
matrix of the integrated circuit, thus allowing for
greater interconnect possibilities. This is shown
diagrammatically in FIG. 1 in the region of ALU 12-1 at
reference numerals 30a and 32a. While only one ALU
circuit 12-1 is shown having such an input structure
in FIG. 1, in order to avoid cluttering up the drawing,
those of ordinary skill in the art will recognize that it
is preferable for all ALU circuits to be similarly
configured.

The control inputs 38 and 40 of first and second
2:1 multiplexers are brought to an interconnect matrix
which includes conductor 42 carrying the VCC potential
for the integrated circuit, conductor 44 carrying
ground potential, and general interconnect conductors
46, 48, and 50. The small circles in the interconnect
matrix at the intersections of control inputs 38
and 40 and conductors 42, 44, 46, 48, and 50 represent
user-programmable interconnect elements, such
as antifuses or pass transistors. Those of ordinary
skill in the art will realize that the arrangement shown
allows maximum flexibility in that the control inputs
38 and 40 of the multiplexers can be hardwired to
VCC or ground to preselect the data source or can be
hardwired to data sources via one of general interconnect
conductors 46, 48, or 50 to dynamically alter the
signal sources during circuit operation.

Those of ordinary skill in the art will recognize
that the multiplexed inputs of the ALUs permit greater
interconnect flexibility. Such persons will also recognize
that, in some applications, these circuit elements
will not be necessary.

The outputs of first and second 2:1 multiplexers
are directed to negate circuits 52 and 54. The function
of negate circuits 52 and 54 is to selectively invert the
data state of the input, and the circuits may be
configured from exclusive OR gates as is known in the
art. The control inputs 56 and 58 of negate circuits 52
and 54 are brought into the interconnect matrix, thus
allowing maximum flexibility of the negate function.

The outputs of negate circuits 52 and 54 drive the
Latch A latches 60 and 62. The outputs of Latch A
latches 60 and 62 form the input terms for adder 64.
Adder 64 may be a conventional multibit adder circuit.
The output of adder 64 drives the input of Latch B 66.
The output of Latch B 66 is connected to output bus
68.
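
The datapath just described (multiplexer, negate circuit, Latch A, adder, Latch B) can be modelled behaviourally. The Python sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, the width is fixed at 8 bits, and a carry-in to the adder is assumed so that a bitwise-inverted operand yields a two's complement subtraction. The source text does not mention a carry input.

```python
from dataclasses import dataclass


@dataclass
class AluModel:
    """Behavioural sketch of the FIG. 2a datapath (hypothetical names)."""

    width: int = 8
    latch_a0: int = 0  # models Latch A 60
    latch_a1: int = 0  # models Latch A 62
    latch_b: int = 0   # models Latch B 66

    def _mask(self) -> int:
        return (1 << self.width) - 1

    def _negate(self, value: int, enable: bool) -> int:
        # XOR every bit with the control line, as XOR-gate negate circuits would.
        return (value ^ self._mask()) if enable else value

    def clock_a(self, a, b, c, d, sel0, sel1, neg0, neg1) -> None:
        """Select one input per multiplexer, optionally invert, load the A latches."""
        mask = self._mask()
        self.latch_a0 = self._negate((b if sel0 else a) & mask, neg0)
        self.latch_a1 = self._negate((d if sel1 else c) & mask, neg1)

    def clock_b(self, carry_in: int = 0) -> int:
        """Add the latched terms and load Latch B (carry_in is an assumption)."""
        self.latch_b = (self.latch_a0 + self.latch_a1 + carry_in) & self._mask()
        return self.latch_b


alu = AluModel()
alu.clock_a(a=9, b=0, c=4, d=0, sel0=False, sel1=False, neg0=False, neg1=True)
print(alu.clock_b(carry_in=1))  # 9 - 4 using an inverted operand plus carry
```

Splitting the model into clock_a and clock_b mirrors the two latch stages, which are clocked separately by the control circuit.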

The A Latches 60 and 62 and the B Latch 66 are
controlled by a control circuit 70. The purpose of control
circuit 70 is to synchronize the operation of the
ALU to assure that the operation of the circuit is
coordinated with the arrival of the correct data to be
processed by the ALU. Control circuit 70 has a clock
(CLK) input 72, an enable (EN) input 74, and an
input-ready in (INRIN) input 76. These inputs are incorporated
into an interconnect matrix including two clock
lines, CLKA line 78 and CLKB line 80, and three general
interconnect conductors 82, 84, and 86. The input
lines are connectable to any of these lines by the
user-programmable interconnect elements shown as small
circles at the intersections of the conductors and the
input lines. Those of ordinary skill in the art will
recognize that the connectivity choices shown in FIG. 2a
are only illustrative, and that the actual choices in an
architecture built in accordance with the teachings of
the present invention will be dictated largely as a matter
of design choice.

Control circuit 70 has four outputs. Output A (line
88) drives the clocks of the Latch A latches 60 and
62, and output B (line 90) drives the clock of Latch
B 66. INROUT line 92 is used for asynchronous connection
of modules and is an input-ready output signal
which would be connected to the input-ready (INRIN)
input of the module connected upstream so that the
upstream module will release data on the next clock.
DATARDY line 94 is a data ready output used to indicate
that data is valid for the next module downstream
to read.

Those of ordinary skill in the art will recognize
that, by using the negate circuits and the multiplexers,
the ALU circuit of FIG. 2a may be configured to
perform the customary logical functions performed
by ALU circuits.

Referring now to FIG. 2b, a state diagram is
presented, showing in detail the operation of the control
circuit portion of the ALU circuit of FIG. 2a. Those of
ordinary skill in the art will recognize that synchronous
stages will not need to utilize the INRIN and
INROUT lines. Asynchronous stages, however, will use
the INRIN and INROUT lines at the interface. Occasional
bytes may be lost, but this should not affect the
overall operation of any circuits configured using the
architecture of the present invention. Lost bytes may
be averaged out by (A+B)/2 of subsequent data bytes
until the required level of smoothness is achieved, so long
as the number of data samples per cycle is adequate.

Those of ordinary skill in the art will recognize
that variations of the architecture of the ALU module
of the present invention are possible and are intended
to fall within the scope of the present invention. For
example, internal memory could be provided in the
ALU modules for instructing them to perform more
than one function and thus increase their flexibility.
However, such skilled persons will recognize that, in
its limit, such an embodiment will suffer from the von
Neumann bottleneck problem of prior art architectures.

The organization of the interconnect architecture
of the present invention makes it possible to utilize the
interconnect itself to perform mathematical functions
such as multiply and divide. This feature of the present
invention is advantageous in that such operations
may be performed in the same clock cycle as the
operations performed by the ALU whose output is driving
the interconnect conductors.

The speed will be limited by the rate at which the
ALU circuits can perform an addition (subtraction)
and a multiplication (division). The multiplication and
division are the mathematical processes that take the
most time. If, however, the application circuit is designed
to use circuit elements such as resistors, capacitors,
inductors, etc. with values that are powers of 2, i.e.,
2, 4, 8, 16, etc., the multiplication and division may be
digitally represented by a shift left or a shift right
operation.

As previously mentioned, these shifts can be
built into the interconnect architecture. An exemplary
scheme for performing such an operation is shown in
FIG. 3. FIG. 3 shows a plurality of horizontal interconnect
conductors 22-1 through 22-6 intersecting a plurality
of vertical interconnect conductors 24-1 through
24-6. At each intersection, a transistor 56-1 through
56-36 is connected between the horizontal and vertical
interconnect conductors. The gates of diagonally-situated
ones of the transistors are connected together
to one of gate lines 58-1 through 58-11.

Those of ordinary skill in the art will appreciate
that transfer of data from conductors 22-1 through 22-6
to corresponding ones of 24-1 to 24-6 will take place
when gate line 58-6 is activated. Data can be shifted
one bit in a first direction in the transfer if gate line 58-5
is activated, two bits if gate line 58-4 is activated,
three bits if gate line 58-3 is activated, and so on. Similar
shifting will take place a selected number of bits
in the other direction if one of gate lines 58-7 through
58-11 is selected.
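
The gate-line selection described above behaves like a small barrel shifter. The following Python sketch models it under explicit assumptions: the function name bus_interchange is hypothetical, the "first direction" is taken to be a left shift (multiplication by a power of 2), bits shifted past either end are dropped, and vacated positions read as zero. None of these details is specified in the text.

```python
N_LINES = 6        # conductors per group in FIG. 3 (22-1..22-6, 24-1..24-6)
STRAIGHT_GATE = 6  # gate line 58-6 gives the unshifted transfer


def bus_interchange(value: int, gate_line: int) -> int:
    """Transfer a 6-bit value across the interchange with one gate line active.

    Gate lines below 58-6 shift in the first direction (assumed: left),
    gate lines above 58-6 shift in the other direction (assumed: right).
    """
    if not 1 <= gate_line <= 2 * N_LINES - 1:
        raise ValueError("gate line must be between 1 and 11")
    mask = (1 << N_LINES) - 1
    shift = STRAIGHT_GATE - gate_line
    if shift >= 0:
        return (value << shift) & mask  # multiply by 2**shift, drop overflow
    return (value & mask) >> -shift     # divide by 2**(-shift)


print(bus_interchange(0b000101, 6))  # straight transfer
print(bus_interchange(0b000101, 5))  # one-bit shift, i.e. times 2
print(bus_interchange(0b010100, 7))  # one-bit shift the other way, i.e. divided by 2
```

With six conductors per group there are eleven diagonals in the transistor grid, which matches the eleven gate lines 58-1 through 58-11.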



Those of ordinary skill in the art will recognize 
that this bit shifting technique can be implemented by 
other user-programmable interconnect devices such 
as antifuses. In such an embodiment, intersecting 
conductive lines may be connected by antifuses and 
the bit shifting to the left or right may be accomplish- 
ed by selective programming of the antifuses. 

A bus interchange like that depicted in FIG. 3 may
be placed at the intersection of horizontal and vertical 
interconnect conductors such as 22 and 24 and may 
also be employed to connect an input bus or output 
bus of an ALU to the horizontal and vertical intercon- 
nect busses of the interconnect architecture. It is ap- 
parent that the multiplication and division operations 
implemented by the shift function disclosed herein 
will take no significant time, and will certainly occur 
in the same clock cycle used to operate the driving 
ALU. Hence those of ordinary skill in the art will ap- 
preciate that the architecture of the present invention 
can perform functions with the same approximate 
speed as high speed analog operational amplifiers. 

As an example of the use of this technique, an
ALU circuit functioning as a digital resistor receives
two multibit digital values representing the voltages
across its terminals and outputs a multibit digital value
representing the current through it by the function
$I = (V_a - V_b)/R$, where R represents its resistance. The
value of R, as any power of 2, may be preprogrammed
into the ALU circuit by shifting the output bus one or
more bit positions. This function could be achieved in
one clock cycle, and the digital resistor performs the
same function on each clock cycle, i.e., subtract two
input numbers and divide by a preprogrammed constant.
Hence the architecture of the present invention
eliminates the need for program storage. Similarly, a
capacitor would be $V = V_0 + (I/C)$, where the inputs are
currents and the output is a voltage. The division
operation for calculating capacitances whose values are
powers of 2 is automatically performed as a result of
a bit shift of one or more places in the opposite direction
from that for a multiplication operation. Similar
simple functions exist for inductors and transformers
and operational amplifiers, comparators, ideal diodes,
switches or multiplexers, which are the building
blocks of analog electronics.
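
The digital resistor and capacitor functions above reduce to a subtraction or addition followed by a fixed shift. A minimal Python sketch follows; the function names and the log2_r and log2_c parameters are hypothetical, integer arithmetic is assumed, and Python's right shift rounds toward negative infinity, which may differ from the rounding of a real implementation.

```python
def digital_resistor(v_a: int, v_b: int, log2_r: int) -> int:
    """Current through a resistor of value 2**log2_r: I = (Va - Vb) / R."""
    return (v_a - v_b) >> log2_r


def digital_capacitor(v_prev: int, current: int, log2_c: int) -> int:
    """New voltage on a capacitor of value 2**log2_c: V = V0 + I / C."""
    return v_prev + (current >> log2_c)


print(digital_resistor(100, 36, 3))   # (100 - 36) / 8
print(digital_capacitor(10, 64, 4))   # 10 + 64 / 16
```

Each function performs the same fixed operation on every call, which is the sense in which no stored program is needed.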

In an integrated circuit of the present invention,
the user-programmed interconnect of the digital ALU
circuits would be a one-to-one map of the analog
equivalent. The additional integration of digital signals
is simple because the digital gates would be
made of the same type of transistors used for digital
circuits. The digital modules may use similar logic as is
currently available in gate arrays, FPGAs, and PALs.
The interconnection of the analog elements may of
course be made in the same manner as used in gate
arrays, FPGAs, and PALs.

An integrated circuit according to the present
invention is easily customizable, suitable for mixing
analog and digital functions, and can be extremely
fast, capable of working with analog signals in the RF
and video frequency ranges. The limiting frequency
will likely be the rate of A/D and D/A conversions at
the boundaries of the system. Flash converters currently
work in the tens of megahertz. The A/D and D/A
converters could either be on chip or off chip depending
on the desire of the designer/manufacturer.

Referring now to FIGS. 4a and 4b, a simple design of an 
inverting unity gain amplifier is shown as an example of 
the operation of the architecture of the present invention. 
FIG. 4a is a schematic diagram of the analog equivalent 
circuit including two one ohm resistors, a 40 nF capacitor, 
and an amplifier having a slew rate of 0.25 $V/V_{IN}$. 
FIG. 4b is a block diagram of the digital equivalent circuit 
as implemented in the architecture of the present invention. 
An analog input voltage is supplied to A/D converter 100, 
which presents its output to ALU 102, programmed to behave 
as the resistor R1 in the circuit of FIG. 4a. ALU 104 is 
programmed to behave as capacitor C, ALU 106 is programmed 
to behave as resistor R2, and ALU 108 is programmed to 
behave as the amplifier element. The entire circuit is 
driven by a 100 MHz clock 110. ALU 102 (resistor 1) 
computes the value $I_1 = (V_{IN} - V_1)/1\,\Omega$. 
ALU 104 (capacitor C) computes the value 
$V_1 = V_{1prev} + (I_1 + I_2)(10\,\mathrm{nsec}/40\,\mathrm{nF})$, 
where $V_{1prev}$ is the voltage from the previous clock 
cycle and 10 nsec is the period of the clock signal. 
ALU 106 (resistor 2) computes the value 
$I_2 = (V_{OUT} - V_1)/1\,\Omega$. Finally, ALU 108 (the 
amplifier) computes the value 
$V_{OUT} = V_{OUTprev} + (-0.25)V_1$. 
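
The four update rules of FIG. 4b can be exercised numerically. 
The following is a minimal Python sketch under stated 
assumptions, not the circuit itself: every ALU is assumed to 
latch its result on the same clock edge from the values its 
neighbours held in the previous cycle, floating point stands in 
for the fixed-width digital buses, and the name 
`simulate_fig4b` and its parameters are hypothetical. 

```python
def simulate_fig4b(vin_samples, dt=10e-9, c=40e-9, r1=1.0, r2=1.0, amp_step=-0.25):
    """Clock-by-clock emulation of the four ALUs of FIG. 4b (illustrative only)."""
    # Registered ALU outputs; starting from zero is an assumption.
    i1 = i2 = v1 = vout = 0.0
    trace = []
    for vin in vin_samples:
        # Each ALU reads only values latched on the previous clock edge.
        i1_next = (vin - v1) / r1             # ALU 102: resistor R1
        i2_next = (vout - v1) / r2            # ALU 106: resistor R2
        v1_next = v1 + (i1 + i2) * (dt / c)   # ALU 104: capacitor C
        vout_next = vout + amp_step * v1      # ALU 108: amplifier
        i1, i2, v1, vout = i1_next, i2_next, v1_next, vout_next
        trace.append(vout)
    return trace
```

Calling `simulate_fig4b([1.0] * 500)` drives the emulated 
circuit with a constant 1 V input; under these assumptions the 
last output sample settles close to -1 V, as expected of an 
inverting unity gain stage. 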

If one or (if necessary) both of the PROM ele- 
ments 20-1 or 20-2 are used in place of the feedback 
loop of the amplifier, special non-linear transforms, 
such as logarithmic output of the amplifier, can be im- 
plemented. The data stored at each address is simply 
the log of the address value. Such a variation of the 
amplifier circuit is shown at FIG. 4c. Those of ordinary 
skill in the art will recognize that the log function gen- 
erator 114 may be implemented by use of a ROM look- 
up table. 
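
A look-up table of this kind can be sketched as follows. This 
is only an illustration: the 8-bit address and data widths, the 
base-2 logarithm, the scaling to full range, and the choice to 
store 0 at address 0 are all assumptions, and `build_log_rom` 
and `log_amplifier` are hypothetical names. 

```python
import math

def build_log_rom(address_bits=8, data_bits=8):
    """Return a table whose entry at address a is a scaled log2(a)."""
    size = 1 << address_bits
    full_scale = (1 << data_bits) - 1
    max_log = math.log2(size - 1)
    rom = [0] * size  # address 0 has no logarithm; 0 is stored by convention
    for address in range(1, size):
        rom[address] = round(math.log2(address) / max_log * full_scale)
    return rom

LOG_ROM = build_log_rom()

def log_amplifier(sample):
    """Apply the non-linear transform with a single table read."""
    return LOG_ROM[sample]
```

The sample value is used directly as the address, so the 
transform costs one memory access regardless of the function 
stored. 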

FIG. 5 is a graph showing the input and output signal 
waveforms of the circuit of FIG. 4b for a sinusoidal 
input waveform. It may be seen from 
FIG. 5 that the output of the amplifier is somewhat 
"phase shifted" due to the pipelining time for the data 
through the ALU system which emulates the analog 
amplifier. 

FIG. 6 is a graph showing the input and output signal 
waveforms of the circuit of FIG. 4b for a square 
input waveform. The damped 
overshoot characteristic which is typical of analog 
amplifiers may be seen on the output waveform. 

According to another aspect of the present inven- 
tion, re-arranging the architecture of the emulated 
amplifier circuit can eliminate the distortion exhibited 
by the circuit of FIG. 4b which is apparent in FIGS. 5 
and 6. Referring now to FIGS. 7a and 7b, an alternate 
configuration can be formed by employing a slower master 
clock and using the data-valid (INR and OUTR) connections 
of the ALU circuits. For convenience, the same reference 
numerals are used in the circuits of FIGS. 4b and 7b, but 
the capacitor C has a value of 60nF and the amplifier has 
a gain of 2. 

In the circuit of FIG. 7b, the calculations in the R1 
and R2 ALU circuits (reference numerals 102 and 106) are 
performed first. Specifically, R1 ALU calculates 
$I_1 = (V_{IN} - V_1)/1\,\Omega$ and R2 ALU calculates 
$I_2 = (V_{OUT} - V_1)/1\,\Omega$. The C ALU (reference 
numeral 104) calculates 
$V_1 = V_{1prev} + (I_1 + I_2)(30\,\mathrm{nsec}/60\,\mathrm{nF})$. 
This ALU is not clocked until the OUTR outputs of R1 and 
R2 ALU circuits 102 and 106 are true, indicating that the 
data at their outputs is valid. The AMP ALU 108 calculates 
the value $V_{OUT} = V_{OUTprev} - 2V_1$ and is not clocked 
until the OUTR output of ALU circuit 104 is true. 
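
The data-valid sequencing of FIG. 7b can be sketched in Python 
as three stages evaluated in order within each master clock 
period. Several points are assumptions: the amplifier update is 
taken to be the previous output minus twice the capacitor 
voltage (by analogy with FIG. 4b and the stated gain of 2), the 
30 nsec period is derived from the 33 MHz clock, floating point 
replaces the digital buses, and `simulate_fig7b` is a 
hypothetical name. 

```python
def simulate_fig7b(vin_samples, dt=30e-9, c=60e-9, r1=1.0, r2=1.0, amp_gain=2.0):
    """Sequenced emulation of FIG. 7b: resistors, then capacitor, then amplifier."""
    v1 = vout = 0.0  # zero initial state is an assumption
    trace = []
    for vin in vin_samples:
        # Stage 1: R1 and R2 ALUs compute first and assert OUTR.
        i1 = (vin - v1) / r1
        i2 = (vout - v1) / r2
        # Stage 2: the C ALU is clocked only once both currents are valid.
        v1 = v1 + (i1 + i2) * (dt / c)
        # Stage 3: the AMP ALU is clocked only once the new V1 is valid.
        vout = vout - amp_gain * v1
        trace.append(vout)
    return trace
```

With these particular values the capacitor update reduces to 
the mean of the input and the previous output, so the emulated 
output equals the negated input after each complete sequence, 
which would be consistent with a response free of overshoot. 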

FIGS. 8 and 9 are graphs showing the input and output 
voltages of the circuit of FIG. 7b for 1 MHz sine and 
square wave inputs, respectively. Those of ordinary skill 
in the art will recognize that, while the phase of the 
output voltage lags that of the input voltage, the square 
wave output is completely free of overshoot. Such skilled 
persons will also recognize that, due to the slower 
clocking speed (i.e., 33 MHz as opposed to 100 MHz for the 
circuit of FIG. 4b), fewer data points are used to define 
the output function. 

Another common problem encountered in analog electronics 
is that various parts of a complex signal must often be 
kept in phase while utilizing different circuit paths. 
A typical example of such a situation is in color 
television where the luminance information is routed 
through a delay line while the chrominance information 
is processed. 

The architecture of the present invention may be used to 
implement an analog shift register as shown in FIG. 10, 
thus making possible any length delay without phase 
alteration. In the example of FIG. 10, three ALU modules 
120, 122 and 124 are shown connected as an analog shift 
register. The ALU modules are configured to compute the 
function $(V_1 + 0)/1$, by connecting the B input busses of 
each to ground and the A input busses of each module to the 
output bus of the preceding ALU module in the chain. This 
technique may be used to configure an analog shift register 
chain of arbitrary length, although only three stages are 
shown in FIG. 10. 
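
The delay chain of FIG. 10 can be modeled as below. This is a 
sketch only: it assumes each module latches the output of its 
predecessor on a common clock edge, and the class name 
`AnalogShiftRegister` and its interface are hypothetical. 

```python
from collections import deque

class AnalogShiftRegister:
    """Chain of pass-through ALU modules acting as a sample delay line."""

    def __init__(self, stages=3, initial=0):
        # One slot per ALU module; a full deque with maxlen drops the
        # oldest sample from the far end when a new one is pushed in.
        self._stages = deque([initial] * stages, maxlen=stages)

    def clock(self, sample):
        """Advance one clock: every module takes its predecessor's value."""
        # Each module computes A + 0 with its B input tied to ground.
        self._stages.appendleft(sample + 0)
        return self._stages[-1]  # output of the last module in the chain
```

With three stages, a sample presented on one clock appears at 
the output on the third clock, unchanged in value; a longer 
delay is obtained by increasing `stages`. 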

The present invention may also be employed to simulate 
tuned circuits. Tuned circuits have to be designed with a 
specific frequency or continuum of frequencies, not just a 
factor of a square root of two ($f = 1/(2\pi\sqrt{LC})$), 
as would be the case if L and C were limited to values 
equal to powers of 2. In digitally emulated tuned circuits 
according to the present invention, the actual value of the 
circuit element is also a function of the frequency at 
which the circuit is clocked. If the number that is output 
is the value of current, then the time period of the ALU 
clock signal will represent a current multiplied by time. 
Therefore the circuit output value is an amount of charge 
or Q. 

As an example, a capacitor ALU having a digital 
value of 1, clocked at a clock frequency of 100 MHz, will 
have a value of C/clock frequency, or 10 nanofarads. 
Hence the actual value of the circuit elements will be 
set by the clock frequency of the ALU. This feature of 
the invention provides an added advantage in that the 
band pass frequency of a tuned circuit configured ac- 
cording to the present invention will change with the 
clock frequency. Using this feature of the present in- 
vention, applications such as frequency synthesizers 
and spectrum analyzers can be easily implemented. 
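
The dependence of element value on clock frequency can be 
illustrated with a short calculation. The capacitor relation 
(digital value divided by clock frequency) is taken from the 
example above; applying the same scaling to an emulated 
inductor is an assumption made here for illustration, and the 
function names are hypothetical. 

```python
import math

def effective_value(digital_value, clock_hz):
    """Effective element value: the digital value divided by the clock frequency."""
    return digital_value / clock_hz

def emulated_resonance(l_digital, c_digital, clock_hz):
    """Resonant frequency f = 1 / (2*pi*sqrt(L*C)) of an emulated LC pair."""
    # Assumption: the inductor scales with the clock the same way as the capacitor.
    l_eff = effective_value(l_digital, clock_hz)
    c_eff = effective_value(c_digital, clock_hz)
    return 1.0 / (2.0 * math.pi * math.sqrt(l_eff * c_eff))
```

Here `effective_value(1, 100e6)` gives 10 nanofarads, matching 
the example, and under the stated assumption doubling the clock 
frequency doubles the emulated resonant frequency. 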

Those of ordinary skill in the art will recognize 
that resonant circuits that have fractionally different 
resonant frequencies from one another will employ 
different clock frequencies in the same integrated cir- 
cuit. It is apparent therefore that care must be taken 
to guarantee that the ALU circuits running at different 
frequencies will not read input values from neighbor- 
ing ALU circuits during data transitions and thus read 
undetermined values. 

One technique to avoid this problem in the cir- 
cuits configured according to the present invention is 
to make a small FIFO of, for example, three signal 
bytes. This would require the use of three ALU cir- 
cuits, unless the ALU circuits are optimized to per- 
form this function. The load signal is determined by 
the output of one ALU circuit and the dump signal 
would be determined by the input ALU running at a 
different frequency. If the FIFO is full, one byte is 
erased and the next byte loaded. If the FIFO is empty 
then the last byte is kept for the next read cycle. This 
is of course only one of many ways to perform this 
function. Another possible method is to design the 
ALU circuits with handshaking such that the waiting 
module will not perform any function on the next clock 
cycle if the adjacent module is not ready to send or re- 
ceive the data. 
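
The small FIFO described above might behave as in the following 
Python sketch. It is illustrative only: the text does not say 
which byte is erased when the FIFO is full, so discarding the 
oldest byte is an assumption, and `ClockDomainFifo`, `load` and 
`dump` are hypothetical names. 

```python
from collections import deque

class ClockDomainFifo:
    """Three-entry FIFO between ALU circuits running at different clock rates."""

    def __init__(self, depth=3, initial=0):
        self._depth = depth
        self._queue = deque()
        self._last = initial  # value returned while the FIFO is empty

    def load(self, value):
        """Called on the clock of the ALU that produces the data."""
        if len(self._queue) == self._depth:
            self._queue.popleft()  # full: erase one byte (oldest, by assumption)
        self._queue.append(value)

    def dump(self):
        """Called on the clock of the ALU that consumes the data."""
        if self._queue:
            self._last = self._queue.popleft()
        return self._last  # empty: the last byte is kept for this read cycle
```

Because the consumer only ever reads a completed entry, it never 
samples a neighbouring output during a data transition. 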

Work has been done in the past in an attempt to 
design processor arrays to speed up applications. 
This class of machines is called MIMD or SIMD 
(multiple instruction multiple data or single instruction 
multiple data). The MIMD and SIMD machines do not 
use interconnect to perform operations such as mul- 
tiplication and division, and instead utilize the proces- 
sor engines to perform these functions in the tradi- 
tional manner. Nor do they utilize the concept of vary- 
ing the processor clock frequency to vary the calcu- 
lation result, as is employed in the present invention. 
Nor does any of this prior work disclose or suggest the 
idea of programming the interconnect to represent an 
analog function to run in real time. Also, the processors 
in these arrays are very complex and are therefore stuck 
with the von Neumann bottleneck, which is an undesirable 
characteristic thereof. The architecture of the present 
invention by its very nature requires each adder/shifter 
to perform only one single function, so there is no data 
bottleneck. This provides a significant advantage over 
the prior art. 

Another advantage of modeling an analog circuit with an 
array of adders and shifters with programmable interconnect 
is that general integer arithmetic can be easily performed 
by combining adder/shifters. Hence the end user can design 
his device to multiply or divide a value by any integer 
when necessary. Since analog circuits typically move a 
signal along a circuit path with few feedback terms, the 
additional time required for the integer arithmetic may not 
slow down the circuit, as this architecture will basically 
pipeline the calculation so long as the calculation is not 
in a high speed feedback term. 
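
One way to combine adder/shifters for multiplication by an 
arbitrary integer is the familiar shift-and-add decomposition, 
sketched below in Python. This is an illustration rather than a 
method stated in the text; the function name is hypothetical, 
and each added term stands for one shifted copy of the signal 
routed through the interconnect into an adder. 

```python
def shift_add_multiply(value, factor):
    """Multiply an integer by a non-negative integer using only shifts and adds."""
    if factor < 0:
        raise ValueError("factor must be non-negative")
    result = 0
    shift = 0
    while factor:
        if factor & 1:
            # This bit of the factor contributes value * 2**shift.
            result += value << shift
        factor >>= 1
        shift += 1
    return result
```

For example, multiplying by 10 uses two shifted copies of the 
signal (by 1 place and by 3 places) and a single addition. 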

The architecture of the present invention could be 
implemented in an FPGA but the modules in these devices are 
small and designed for logic functions, typically one bit 
wide. Hence many modules would have to be used to make a 
10 bit adder and the interconnect architectures in FPGA 
devices do not provide a sufficient number of lines to 
efficiently implement the shift function in the 
interconnect. Hence the circuit cost per analog function 
will be high and the speed will be much slower. 
Additionally the modules in an FPGA are not designed to 
accept signals arriving asynchronously with the clock 
signal. 

Circuits that utilize feedback that is running at the 
signal frequency, such as an RLC circuit where the 
components interact to form a tuned circuit, represent the 
limiting frequency of the performance of this invention. 
This is caused by a phase shift between the signal and the 
reaction to that signal which represents at best one clock 
delay. For these applications the circuit will be more 
stable if the modules are clocked in series rather than in 
parallel. This of course causes the maximum operating 
frequency of the circuit to be limited (divided) by a 
factor of the number of series clock pulses used. Such a 
clocking scheme is useful for applications in which the 
Z transform for the specific circuit is solved and applied 
to the module array, as opposed to simply placing the 
circuit elements one to each module. 

Two examples of a simple series RLC tuned circuit 
implemented using the architecture of the present invention 
are shown in FIGS. 11a and 11b. Turning first to the 
embodiment of FIG. 11a, the straightforward placement 
requires four ALU modules 130, 132, 134, and 136, driven by 
four sequential clocks CLK1, CLK2, CLK3, and CLK4. The 
circuit is envisioned as an input node impressed with a 
voltage $V_{in}$ in series with an inductance L in series 
with a resistance R in series with a capacitance C to 
ground. ALU module 130, driven by CLK1, computes 
$V_{in} - V_{2prev}$, where $V_{2prev}$ is the voltage at 
the node joining the inductance L and resistance R at the 
last clock cycle. ALU module 132, driven by CLK2, computes 
$i_{prev} + \Delta i$, where $i_{prev}$ is the current 
through the RLC circuit at the previous 
clock cycle and $\Delta i$ is the change in current to the 
current clock cycle. The change in current is obtained by 
dividing the output of ALU module 130 by L (as noted in 
FIG. 11a). As taught herein, this may be done by the bit 
shifting technique disclosed with reference to FIG. 3 
and accompanying disclosure. 

ALU module 134, driven by CLK3, computes 
$V_{1prev} + i/C$, where $V_{1prev}$ is the voltage at the 
node connecting the resistance R to the capacitance C at 
the previous clock cycle and i/C is simply the current i 
(output of ALU module 132) divided by the capacitance C 
(as noted in FIG. 11a) by the bit shifting technique. 
ALU module 136, driven by CLK4, computes $V_1 + iR$, 
where $V_1$ is the voltage at the node connecting the 
resistance R to the capacitance C at the current clock 
cycle and iR is simply the current i (output of ALU 
module 132) multiplied by the resistance R (as noted 
in FIG. 11a) by the bit shifting technique. 
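
The four sequentially clocked modules of FIG. 11a can be 
sketched in Python as follows. The sketch makes several 
assumptions: L, C and R are powers of two expressed in 
clock-normalized units (the clock period is absorbed into the 
element values), floating point division and multiplication 
stand in for the interconnect bit shifts, the state starts at 
zero, and the function name and shift parameters are 
hypothetical. 

```python
def simulate_series_rlc(vin_samples, l_shift=4, c_shift=4, r_shift=2):
    """Emulate FIG. 11a: four ALU modules clocked one after another."""
    i = v1 = v2 = 0.0
    trace = []
    for vin in vin_samples:
        diff = vin - v2                # ALU 130 (CLK1): V_in - V_2prev
        i = i + diff / (1 << l_shift)  # ALU 132 (CLK2): i_prev + delta_i
        v1 = v1 + i / (1 << c_shift)   # ALU 134 (CLK3): V_1prev + i/C
        v2 = v1 + i * (1 << r_shift)   # ALU 136 (CLK4): V_1 + iR
        trace.append(v1)               # record the capacitor node voltage
    return trace
```

With the default values (L = 16, C = 16, R = 4) and a constant 
input, the capacitor node voltage returned by 
`simulate_series_rlc([1.0] * 3000)` settles toward the input 
voltage as the emulated current decays to zero. 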

As shown in FIG. 11b, judicious placement of the 
Z transform reduces the number of clocks to two and 
increases the number of ALU modules to five. The im- 
plementation of FIG. 11b doubles the maximum fre- 
quency. In this sense the present invention can be im- 
agined as a parallel programmable Z transform. 

In the embodiment of FIG. 11b, the input voltage $V_{in}$ 
is applied to ALU module 140, driven by CLK1, which 
computes $X = V_{in} - V_c$, where $V_c$ is the voltage 
across capacitance C in the present cycle. ALU module 142, 
driven by CLK2, computes the function 
$Y = (X - I_{prev}R)/L$, where X is the result of the 
calculation of ALU module 140, $I_{prev}$ is the current of 
the previous cycle, and R and L are the resistance and 
inductance, respectively. ALU module 144, driven by CLK1, 
computes $Z = I_{prev} - I_{prev}/LC$, where L and C are 
the inductance and capacitance, respectively. ALU module 
146, driven by CLK2, computes the function $I = Y + Z$, 
where I is the current of the present cycle, and Y and Z 
are the results of the last calculations made by ALU 
modules 142 and 144, respectively. ALU module 148, driven 
by CLK1, computes $V_c$, the voltage across the capacitance 
in the present cycle, as $V_{cprev}$, the voltage across 
the capacitance C in the previous cycle, minus the quantity 
$I_{prev}/C$. 

Those of ordinary skill in the art will recognize that 
the term IR at one input to ALU module 142 may be obtained 
by the bit shifting techniques taught herein. Similarly, 
the terms $I_{prev}/C$ of ALU module 148 and 
$I_{prev}/LC$ at the input to ALU module 144 may be 
similarly obtained. While this bit shifting multiply and 
divide technique does allow use of a minimal number of ALU 
modules, those of ordinary skill in the art will recognize 
that the values of the multiplicands and divisors are 
limited to integers which are powers of 2, i.e., 2, 4, 8, 
16, etc. Such skilled persons will recognize that divider 
and multiplier circuits may be configured from multiple 
ALU modules to provide more flexibility of component value 
choices at the expense of greater circuit complexity and 
ALU utilization. 



Another feature of the present invention is that many 
circuit elements normally employed in analog circuits can 
be eliminated since no biasing, impedance matching, or 
buffering is necessary. A double balanced mixer configured 
using the architecture of the present invention requires 
only one module to perform the function $|V_1 + V_2|/2$. 
The module is programmed to add the two numbers and, if the 
most significant bit indicates a negative value (signed 
integer), then perform the two's complement that the module 
would normally do for a subtract. The divide by two is done 
on the output to the interconnect. Hence three coupling 
transformers, two diodes and an amplifier are modeled by 
one module. 
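
The single-module mixer function can be sketched in Python at 
the bit level. The 10-bit word width, the absence of overflow 
in the sum, and the name `mixer_module` are assumptions made 
for illustration. 

```python
def mixer_module(v1, v2, width=10):
    """Compute |v1 + v2| / 2 on a two's-complement word of the given width."""
    mask = (1 << width) - 1
    total = (v1 + v2) & mask         # adder output as a raw two's-complement word
    if total >> (width - 1):         # most significant bit set: value is negative
        total = (~total + 1) & mask  # two's complement, as done for a subtract
    return total >> 1                # divide by two via a one-place shift
```

For example, `mixer_module(-7, 1)` forms the sum -6, negates it 
to 6, and shifts it one place to give 3. 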

Variations in gain for circuits such as AGC circuits can 
be implemented as powers of two by designing the module 
interconnect with transistors that can be switched in the 
circuit, as opposed to hard wired interconnect with 
antifuses. Another method of varying gain would be to 
provide a resistor divider programmed into the modules 
wherein the resistor value is set in SRAM memory in the 
module that can be changed on the fly. 

Sine wave oscillators are made with this architecture 
with only two clocks, one representing the L and one the C. 
Since these devices are mathematical there is no series 
resistance and therefore no damping of the oscillation. 
Hence the oscillator, once started, runs forever. By 
setting its initial conditions, the phase and amplitude are 
determined for every cycle until reset. Phase locked loops 
are therefore simple to implement. An excellent application 
would be synchronizing a 3.58 MHz oscillator to the color 
burst signal of an NTSC (TV) signal for decoding the color 
information. The clock frequency will change the oscillator 
frequency and the amplitude can be loaded at any time to 
synchronize with the input signal. 
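
An undamped emulated oscillator of this kind can be sketched in 
Python as two alternately updated modules. The sketch assumes 
that the inductor module updates the current first and that the 
capacitor module then uses the new current, that L and C are 
powers of two in clock-normalized units, and that floating 
point replaces the digital buses; `lc_oscillator` and its 
parameters are hypothetical names. 

```python
def lc_oscillator(steps, l_shift=4, c_shift=4, v0=1.0, i0=0.0):
    """Emulate a lossless LC loop; v0 and i0 set the amplitude and phase."""
    v, i = v0, i0
    trace = []
    for _ in range(steps):
        i = i - v / (1 << l_shift)  # inductor module: current driven by voltage
        v = v + i / (1 << c_shift)  # capacitor module: uses the updated current
        trace.append(v)
    return trace
```

Because there is no resistive term, the amplitude in this 
sketch neither grows nor decays over many cycles, and reloading 
`v0` and `i0` would reset the phase and amplitude. 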

As mentioned earlier, this technology can combine analog 
and digital functions with ease. An example would be the 
combining of a digital phase locked loop to generate the 
various clock frequencies required to run the different 
circuit blocks. This would reduce the need to input these 
signals from off chip and therefore increase speed and 
reduce pin count and power consumption. 

Another feature of this architecture is that once the 
signal is digitized a more complex system can be built by 
merely adding more chips. These would be designed such that 
all the digital outputs for a signal are adjacent and would 
match up to the inputs of another chip, allowing 
communicating pins from two chips to be placed side by 
side. Lead lengths and capacitance loading are therefore 
minimized, allowing communication of the signal from one 
chip to the next at the maximum possible frequency. The 
signal need not be converted back to analog until necessary 
to return the signal to the real world (i.e. speaker or 
video monitor). Of course if the information goes to a 
computer then the signal need never be converted back to 
analog. 

The modules may be designed with gated inputs to control 
the time a signal is loaded, as is the case in 
synchronizing a signal, or to steer the input, as is the 
case with multiplexers. 

If desired, this architecture could integrate integer 
divide and multiply in the modules to perform the 
calculations, thereby eliminating the requirement of using 
component values of a power of two. Clock frequencies would 
not therefore need to be fractionally different. This of 
course would lower the speed and density of the chip, but 
it will still be considerably faster than conventional DSP 
chips as there still would be no von Neumann bottleneck. 

Some chips could be specialized by designing more 
specialized modules, optimized for special applications. 
For example, such a module could be optimized for the 
series RLC circuit example disclosed herein and could speed 
up the maximum chip operating frequency by about a factor 
of two. 

From the above description, those of ordinary skill in 
the art will recognize that a field programmable version of 
the architecture of the present invention could be used to 
make prototype circuits, and that mask programmable 
versions of the present architecture could be used in a 
production environment. Such mask programmable versions 
fall within the scope of the present invention. 

While embodiments and applications of this invention have 
been shown and described, it would be apparent to those 
skilled in the art that many more modifications than 
mentioned above are possible without departing from the 
inventive concepts herein. The invention, therefore, is not 
to be restricted except in the spirit of the appended 
claims. 



Claims 

1. A field programmable, digital signal processing 
integrated circuit, comprising: 

a plurality of input/output pads; 

at least one analog to digital converter disposed in said 
integrated circuit, said at least one analog to digital 
converter having an analog input and a plurality of digital 
outputs; 

at least one digital to analog converter disposed in said 
integrated circuit, said at least one digital to analog 
converter having a plurality of digital inputs and an 
analog output; 

a plurality of ALU circuits disposed in the integrated 
circuit, each of said ALU circuits having inputs and 
outputs; 

means for individually defining the operation to be 
performed by each of said ALU circuits; 

a plurality of interconnect conductors in the integrated 
circuit; and 

interconnect means for connecting selected ones of said 
interconnect conductors to at least one other interconnect 
conductor, for connecting selected ones of said 
interconnect conductors to said inputs of said ALU 
circuits, for connecting selected ones of said interconnect 
conductors to said outputs of said ALU circuits, for 
connecting selected ones of said interconnect conductors to 
said digital outputs of said at least one analog to digital 
converter, for connecting selected ones of said 
interconnect conductors to said digital inputs of said at 
least one digital to analog converter, for connecting 
selected ones of said inputs and outputs of said ALU 
circuits to one another, for connecting said input/output 
pads to said analog input of said at least one analog to 
digital converter, and for connecting said input/output 
pads to said analog output of said at least one digital to 
analog converter, at least some of said interconnect means 
being user programmable. 

2. The field programmable, digital signal processing 
integrated circuit of claim 1, further including: 

at least one PROM circuit disposed in said integrated 
circuit, said PROM including a plurality of address input 
lines and a plurality of data output lines; 

interconnect means for connecting selected ones of said 
interconnect conductors to said plurality of address input 
lines and said plurality of data output lines of said at 
least one PROM circuit. 